A Flexible Fuzzy Expert System for Fuzzy Duplicate Elimination in Data Cleaning

Authors

  • Hamid Haidarian Shahri
  • Ahmad Abdollahzadeh Barforoush
Abstract

Data cleaning deals with the detection and removal of errors and inconsistencies in data gathered from distributed sources. This process is essential for drawing correct conclusions from data in decision support systems. Eliminating fuzzy duplicate records is a fundamental part of the data cleaning process. The vagueness and uncertainty involved in detecting fuzzy duplicates make it a niche for applying fuzzy reasoning. Although uncertainty algebras like fuzzy logic are well known, their applicability to the problem of duplicate elimination has so far remained unexplored and unclear. In this paper, a novel and flexible fuzzy expert system for the detection and elimination of fuzzy duplicates in the process of data cleaning is devised, which circumvents the repetitive and inconvenient task of hard-coding. Some of the crucial advantages of this approach are its flexibility, ease of use, extendibility, fast development time, and efficient run time when used in various information systems.
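The abstract does not give the system's rules or membership functions, so the following is only an illustrative sketch of how fuzzy reasoning could score a pair of records as duplicates. Everything in it is an assumption made for illustration: the field names `name` and `address`, the use of `difflib.SequenceMatcher` as the similarity measure, the ramp-shaped membership function `high` with breakpoints 0.5 and 0.9, and the single rule combined with `min` as the fuzzy AND.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1].

    SequenceMatcher is a stand-in here; the paper's measure is not stated.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def high(x: float, low: float = 0.5, full: float = 0.9) -> float:
    """Ramp membership function for the linguistic term 'high'.

    The breakpoints 0.5 and 0.9 are hypothetical, chosen for illustration.
    """
    if x <= low:
        return 0.0
    if x >= full:
        return 1.0
    return (x - low) / (full - low)


def duplicate_degree(r1: dict, r2: dict) -> float:
    """Degree in [0, 1] to which two records are fuzzy duplicates.

    Hypothetical rule:
      IF name similarity is high AND address similarity is high
      THEN the records are duplicates
    with the fuzzy AND evaluated as the minimum of the two memberships.
    """
    name_high = high(similarity(r1["name"], r2["name"]))
    addr_high = high(similarity(r1["address"], r2["address"]))
    return min(name_high, addr_high)


if __name__ == "__main__":
    a = {"name": "John Smith", "address": "12 Main Street"}
    b = {"name": "Jon Smith", "address": "12 Main St."}
    print(duplicate_degree(a, b))
```

Because the rule and the membership breakpoints are plain data and small functions rather than logic buried in a matching routine, they can be changed without rewriting the comparison code, which is the kind of flexibility the abstract attributes to a rule-based approach.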

Similar resources

Eliminating Fuzzy Duplicates in Data Warehouses

The duplicate elimination problem of detecting multiple tuples, which describe the same real world entity, is an important data cleaning problem. Previous domain independent solutions to this problem relied on standard textual similarity functions (e.g., edit distance, cosine metric) between multi-attribute tuples. However, such approaches ...

A Fuzzy Expert System for Predicting the Performance of Switched Reluctance Motor

In this paper a fuzzy expert system for predicting the performance of a switched reluctance motor has been developed. The design vector consists of design parameters, and output performance variables are efficiency and torque ripple. An accurate analysis program based on Improved Magnetic Equivalent Circuit (IMEC) method has been used to generate the input-output data. These input-output data i...

A Fuzzy Expert System & Neuro-Fuzzy System Using Soft Computing For Gestational Diabetes Mellitus Diagnosis

Gestational diabetes mellitus (GDM) is a kind of diabetes that requires persistent medical care and patient self-management education to prevent acute complications. One of the common and main problems in the diagnosis of diabetes is weakness in detecting the illness in its initial stages. This paper proposes an expert system to diagnose the risk of GDM by using an FIS model. The knowl...

A Flexible Link Radar Control Based on Type-2 Fuzzy Systems

An adaptive neuro-fuzzy inference system based on interval Gaussian type-2 fuzzy sets in the antecedent part and Gaussian type-1 fuzzy sets as coefficients of a linear combination of input variables in the consequent part is presented in this paper. The capability of the proposed method (which we name ANFIS2) for function approximation and dynamical system identification is remarkable. The structure o...

Publication date: 2004